Abstract

We propose a class of Item Response Theory models for items with ordinal poly- 
tomous responses, which extends an existing class of multidimensional models for 
dichotomously-scored items measuring more than one latent trait. In the proposed 
approach, the random vector used to represent the latent traits is assumed to have 
a discrete distribution with support points corresponding to different latent classes 
in the population. We also allow for different parameterizations for the conditional 
distribution of the response variables given the latent traits - such as those adopted 
in the Graded Response model, in the Partial Credit model, and in the Rating Scale 
model - depending on both the type of link function and the constraints imposed on 
the item parameters. For the proposed models we outline how to perform maximum 
likelihood estimation via the Expectation-Maximization algorithm. Moreover, we 
suggest a strategy for model selection which is based on a series of steps consisting 
of selecting specific features, such as the number of latent dimensions, the number 
of latent classes, and the specific parametrization. In order to illustrate the pro- 
posed approach, we analyze data deriving from a study on anxiety and depression 
as perceived by oncological patients. 
1 Introduction



Item Response Theory (IRT) models are commonly used to analyze data deriving from 
the administration of questionnaires made of items with dichotomous or polytomous re- 
sponses (also known, in the educational setting, as dichotomously or polytomously-scored 
items). Dichotomous responses are usually labelled as true or false, right or wrong, yes 
or no, whereas polytomous responses correspond to more than two options. Polytomous 
responses include both nominal and ordinal responses. In the former, there is no natural 
ordering in the item response categories. In the latter, which are of interest here,
each item has responses corresponding to a number of ordered categories (e.g., correct, 
partially correct, wrong). While nominal polytomous items are especially used to inves- 
tigate customers' choices and preferences, ordinal polytomous items are widespread in 
several contexts, such as in education, marketing, and psychology. For a review of
polytomous IRT models, see Hambleton and Swaminathan (1985), Van der Linden and 
Hambleton (1997), and Nering and Ostini (2010). 

A number of models have been proposed in the psychometrical and statistical litera- 
ture to analyze items with ordinal polytomous responses, and several taxonomies can be 
adopted. Among the best known, we recall those due to Samejima (1972), Molenaar
(1983), and Thissen and Steinberg (1986) which, even though developed independently
of one another, are strongly related and overlapping (Samejima, 1996; Hemker et al., 2001).
Combining these parameterizations with possible constraints on item discriminating and 
difficulty parameters, the most well known IRT models for polytomous responses result, 
such as the Graded Response model (GRM; Samejima, 1969), the Partial Credit model 
(PCM; Masters, 1982), the Rating Scale model (RSM; Andrich, 1978), and the Gener- 
alized Partial Credit Model (GPCM; Muraki, 1992). These models are based on the 
unidimensionality assumption and, for some of them, the normality assumption of this 
latent trait is explicitly introduced. 

Several extensions of traditional IRT models for polytomous responses have been pro- 
posed in the literature in order to overcome some restrictive assumptions and to make 
the models more flexible and realistic. Firstly, some authors dealt with multidimensional 
extensions of IRT models to take into account that questionnaires are often designed to 
measure more than one latent trait. Among the main contributions in the context of IRT 
models for polytomous responses, we recall Duncan and Stenbeck (1987), Agresti (1993)
and Kelderman and Rijkes (1994), who proposed a number of examples of loglinear mul- 
tidimensional IRT models, Kelderman (1996) for a multidimensional version of the PCM, 
and Adams et al. (1997) for a wide class of Rasch type (Rasch, 1960; Wright and Masters, 
1982) extended models; see Reckase (2009) for a thorough overview of this topic. 

Another advance in the IRT literature concerns the assumption that the population 
under study is composed of homogeneous classes of individuals who have very similar
latent characteristics (Lazarsfeld and Henry, 1968; Goodman, 1974). In some contexts, 
where the aim is to cluster individuals, this is a convenient assumption; in health care, 
for instance, by introducing this assumption we single out a certain number of clusters 
of patients receiving the same clinical treatment. In addition, this assumption allows us to
estimate the model in a semi-parametric way, namely without formulating any assumption
on the latent trait distribution. Moreover, it is possible to implement the maximum
marginal likelihood method making use of the Expectation-Maximization (EM) algorithm 
(Dempster et al., 1977), thus avoiding the intractability of the multidimensional
integral which characterizes the marginal likelihood when a continuous latent variable
is assumed. In this regard, Christensen et al. (2002) outline, through a simulation
study, the computational problems encountered during the estimation process of a multi- 
dimensional model based on a multivariate normally distributed ability. See also Masters 
(1985), Langeheine and Rost (1988), Heinen (1996), and Formann (2007) for a comparison
between traditional IRT models and those formulated by a latent class approach. For
some examples of discretized variants of IRT models we also recall Lindsay et al. (1991),
Formann (1992), Hoijtink and Molenaar (1997), Vermunt (2001), and Smit et al. (2003). 
Another interesting example of combination between the IRT approach and latent class 
approach is represented by the mixed Rasch model for ordinal polytomous data (Rost, 
1991; von Davier and Rost, 1995), built as a mixture of latent classes with a separate
Rasch model assumed to hold within each of these classes. 

As concerns the combination of the two above mentioned extensions, in the context of 
dichotomously-scored items Bartolucci (2007) proposed a class of multidimensional latent 
class (LC) IRT models, where: (i) more latent traits are simultaneously considered and 
each item is associated with only one of them (between-item multidimensionality - for 
details see Adams et al. (1997); Zhang (2004)) and (ii) these latent traits are represented 
by a random vector with a discrete distribution common to all subjects (each support 
point of such a distribution identifies a different latent class of individuals). Moreover, in 
this class of models either a Rasch (Rasch, 1960) or a two-parameter logistic (2PL) pa- 
rameterization (Birnbaum, 1968) may be adopted for the probability of a correct response 
to each item. Similarly to Bartolucci (2007), von Davier (2008) proposed the diagnostic 
model, which, as its main difference, assumes fixed rather than free abilities. An interesting
comparison of multidimensional IRT models based on continuous and discrete latent
traits was performed by Haberman et al. (2008) in terms of goodness of fit, similarity of 
parameter estimates and computational time required. 

The aim of the present paper is to extend the class of models of Bartolucci (2007) to 
the case of items with ordinal polytomous responses. The proposed extension is formulated
so that different parameterizations may be adopted for the conditional distribution of the 
response variables, given the latent traits. We mainly refer to the classification criterion 
proposed by Molenaar (1983); see also Agresti (1990) and Van der Ark (2001). Relying on 
the type of link function, it allows us to distinguish among: (i) graded response models, based
on global (or cumulative) logits; (ii) partial credit models, which make use of local (or
adjacent category) logits; and (iii) sequential models, based on continuation ratio logits.
For each of these link functions, we explicitly consider the possible presence of constraints 
on item discrimination parameters and threshold difficulties. As concerns the first element, 
we take into account the possibility that all items have the same discriminating power 
against the possibility that they discriminate differently. Moreover, we discern the case 
in which each item differs from the others for different distances between the difficulties 
of consecutive response categories and the special case in which the distance between 
difficulty levels from category to category is the same for all items. On the basis of 
the choice of all the mentioned features (i.e., type of link function, item discriminant 
parameters, item difficulties), different parameterizations for ordinal responses are defined. 
We show how these parameterizations result in an extension of traditional IRT models, 
by introducing assumptions of multidimensionality and discreteness of latent traits. 

In order to estimate each model in the proposed class, we outline an EM algorithm.
Moreover, special attention is given to the model selection procedure, that aims at choos- 
ing the optimal number of latent classes, the type of link function, the number of latent 
dimensions and the allocation of items within each dimension, and the parameterization 
for the item discriminating and difficulty parameters. 

In order to illustrate the proposed class of models, we analyze a dataset collected by a 
questionnaire on anxiety and depression of oncological patients, and formulated following 
the "Hospital Anxiety and Depression Scale" (HADS) developed by Zigmond and Snaith 
(1983). Through this application, each step of the model selection procedure is illustrated 
and the characteristics of each latent class, in terms of estimated levels of the latent traits, 
are described with reference to the selected model. 

In summary, the proposed class of models allows for (i) ordinal polytomous responses
of different nature, (ii) multidimensionality, and (iii) discreteness of latent traits, at the
same time. As concerns the first point, our model includes different link functions that
are suitable for a wide range of empirical data. Moreover, our formulation allows for
estimating both abilities and probabilities, and the introduction of latent classes represents a
semi-parametric approach that computationally simplifies, through an EM algorithm, the
maximization of the log-likelihood function during the estimation process. To our knowledge,
there are no other contributions treating all these topics in a single unifying framework,
even if the single aspects are separately included in several existing types of models, as
outlined above.

The remainder of this paper is organized as follows. In Section 2 we describe some
basic parameterizations for IRT models for items with ordinal responses. In Section 3 we 
describe the proposed class of multidimensional LC IRT models for items with ordinal 
responses. Section 4 is devoted to maximum likelihood estimation which is implemented 
through an EM algorithm; moreover, in the same section we treat the issue of model 
selection. In Section 5, the proposed class of models is illustrated through the analysis of 
a real dataset, whereas some final remarks are reported in Section 6. 

2 Models for polytomous item responses 

Let $X_j$ denote the response variable for the $j$-th item of the questionnaire, with
$j = 1, \ldots, r$. This variable has $l_j$ categories, indexed from 0 to $l_j - 1$. Moreover, in the
unidimensional case, let

$$\lambda_{jx}(\theta) = p(X_j = x \mid \Theta = \theta), \qquad x = 0, \ldots, l_j - 1,$$

denote the probability that a subject with latent trait (or ability) level $\theta$ responds by
category $x$ to this item. Also let $\lambda_j(\theta)$ denote the probability vector
$(\lambda_{j0}(\theta), \ldots, \lambda_{j,l_j-1}(\theta))'$, the elements of which sum up to 1.

The IRT models for polytomous responses that are here of interest may be expressed
through the general formulation

$$g_x[\lambda_j(\theta)] = \gamma_j(\theta - \beta_{jx}), \qquad j = 1, \ldots, r, \quad x = 1, \ldots, l_j - 1, \qquad (1)$$

where $g_x(\cdot)$ is a link function specific to category $x$ and $\gamma_j$ and $\beta_{jx}$ are item parameters
which are usually identified as discrimination indices and difficulty levels and on which
suitable constraints may be assumed.

On the basis of the specification of the link function in (1) and on the basis of the 
adopted constraint on the item parameters, different unidimensional IRT models for poly- 
tomous responses result. In particular, the formulation of each of these models depends 
on: 

1. Type of link function: We consider the link based on: (i) global (or cumulative)
logits; (ii) local (or adjacent categories) logits; and (iii) continuation ratio logits. In
the first case, the link function is defined as

$$g_x[\lambda_j(\theta)] = \log \frac{\lambda_{jx}(\theta) + \cdots + \lambda_{j,l_j-1}(\theta)}{\lambda_{j0}(\theta) + \cdots + \lambda_{j,x-1}(\theta)} = \log \frac{p(X_j \geq x \mid \theta)}{p(X_j < x \mid \theta)}, \qquad x = 1, \ldots, l_j - 1,$$

and compares the probability that the item response is in category $x$ or higher with the
probability that it is in a lower category. Moreover, with local logits we have that

$$g_x[\lambda_j(\theta)] = \log \frac{\lambda_{jx}(\theta)}{\lambda_{j,x-1}(\theta)} = \log \frac{p(X_j = x \mid \theta)}{p(X_j = x - 1 \mid \theta)}, \qquad x = 1, \ldots, l_j - 1,$$

and then the probability of each category $x$ is compared with the probability of the
previous category. Finally, with continuation ratio logits we have that

$$g_x[\lambda_j(\theta)] = \log \frac{\lambda_{jx}(\theta) + \cdots + \lambda_{j,l_j-1}(\theta)}{\lambda_{j,x-1}(\theta)} = \log \frac{p(X_j \geq x \mid \theta)}{p(X_j = x - 1 \mid \theta)}, \qquad x = 1, \ldots, l_j - 1,$$

and then the probability of a response in category $x$ or higher is compared with the
probability of the previous category.

Global logits are typically used when the trait of interest is assumed to be continuous
but latent, so that it can be observed only when each subject reaches a
given threshold on the latent continuum. On the contrary, local logits are used to
identify one or more intermediate levels of performance on an item and to award a
partial credit for reaching such intermediate levels. Finally, the continuation ratio logit
is useful when sequential cognitive processes are involved (e.g., problem solving or
repeated trials), as typically happens in the educational context. Note that
the interpretation of continuation ratio logits is very different from that of local
logits. The latter describe the transition from one category to an adjacent one
given that one of these two categories has been chosen. Thus, each of these logits
excludes any other category. By contrast, continuation ratio logits describe the
transition between adjacent categories, given that the smaller of the two has
been reached. IRT models based on global logits are also known as graded response
models, and those based on local logits are known as partial credit models. Moreover,
IRT models based on continuation ratio logits are also called sequential models.

2. Constraints on the discrimination parameters: We consider: (i) a general situation
in which each item may discriminate differently from the others and (ii) a special
case in which all the items discriminate in the same way, that is

$$\gamma_j = 1, \qquad j = 1, \ldots, r. \qquad (2)$$

Note that, in both cases, we assume that, within each item, all response categories
share the same $\gamma_j$, in order to keep the conditional probabilities from crossing
and so avoid degenerate conditional response probabilities.

3. Formulation of item difficulty parameters: We consider: (i) a general situation in
which the parameters $\beta_{jx}$ are unconstrained and (ii) a special case in which these
parameters are constrained so that the distance between difficulty levels from category
to category is the same for each item (rating scale parameterization). Obviously,
the second case makes sense when all items have the same number of response
categories, that is $l_j = l$, $j = 1, \ldots, r$. This constraint may be expressed as

$$\beta_{jx} = \beta_j + \tau_x, \qquad j = 1, \ldots, r, \quad x = 1, \ldots, l - 1, \qquad (3)$$

where $\beta_j$ indicates the difficulty of item $j$ and $\tau_x$ is the difficulty of response category
$x$ for all $j$.
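
As a small numerical illustration of the three link functions in point 1, the sketch below computes the logits directly from a vector of category probabilities. The function name `ordinal_logits` and the probability values are hypothetical, and the continuation ratio logit is taken in the form log p(X >= x) / p(X = x - 1), which is an assumption consistent with the verbal description of sequential models given above.

```python
import math

def ordinal_logits(probs, kind):
    """Logits g_1, ..., g_{l-1} of a probability vector over categories 0..l-1."""
    logits = []
    for x in range(1, len(probs)):
        at_or_above = sum(probs[x:])   # p(X >= x)
        below = sum(probs[:x])         # p(X < x)
        if kind == "global":           # cumulative logit
            num, den = at_or_above, below
        elif kind == "local":          # adjacent-category logit
            num, den = probs[x], probs[x - 1]
        elif kind == "continuation":   # continuation ratio logit (assumed form)
            num, den = at_or_above, probs[x - 1]
        else:
            raise ValueError("unknown link type: " + kind)
        logits.append(math.log(num / den))
    return logits

# Illustrative probabilities for an item with three ordered categories.
probs = [0.2, 0.3, 0.5]
for kind in ("global", "local", "continuation"):
    print(kind, [round(v, 4) for v in ordinal_logits(probs, kind)])
```

For the first category boundary the global and continuation ratio logits coincide, since both compare p(X >= 1) with p(X = 0); they differ from the second boundary onwards.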

By combining the above constraints, we obtain four different specifications of the item 
parametrization, based on free or constrained discrimination parameters and on rating 
scale or free parameterization for difficulties. Therefore, also according to the type of link 
function, twelve different types of unidimensional IRT model for ordinal responses result. 
These models are listed in Table 1. 



discrimination   difficulty    resulting                                  resulting model (depending on the type of logit)
indices          levels        parameterization                           global       local      continuation

free             free          $\gamma_j(\theta - \beta_{jx})$            GRM          GPCM       SM
free             constrained   $\gamma_j[\theta - (\beta_j + \tau_x)]$    RS-GRM       RS-GPCM    RS-SM
constrained      free          $\theta - \beta_{jx}$                      1P-GRM       PCM        SRM
constrained      constrained   $\theta - (\beta_j + \tau_x)$              1P-RS-GRM    RSM        SRSM

Table 1: List of unidimensional IRT models for ordinal polytomous responses which result
from the different choices of the link function, constraints on the discrimination indices,
and constraints on the difficulty levels.



Abbreviations used for the models specified in Table 1 refer to the way the corre- 
sponding models are known in the literature. Thus, other than GRM, RSM, PCM, and 
GPCM already mentioned in Section 1 it is possible to identify: SM indicating the Se- 
quential Model obtained as special case of the acceleration model of Samejima (1995), 
where the acceleration step parameter is constrained to one and the discriminant indices 
are all constant over the response categories; RS-GRM indicating the rating scale ver- 
sion of the GRM introduced by Muraki (1990); RS-GPCM and RS-SM that are rating 
scale versions of GPCM (Muraki, 1997) and SM, respectively; 1P-GRM (Van der Ark,
2001), 1P-RS-GRM (Van der Ark, 2001), and SRM (Sequential Rasch Model; Tutz, 1990)
indicating versions with constant discrimination index corresponding to the GRM, RS-GRM,
and SM models, respectively. Finally, by SRSM we indicate the Sequential Rating
Scale Model of Tutz (1990). We observe that Table 1 identifies a hierarchy of models in
correspondence with each type of link function. 
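
To make the parameterizations of Table 1 more concrete, the following sketch inverts the three types of logit to obtain the category probabilities of a single item from the linear predictors gamma_j (theta - beta_jx). It is only an illustration under stated assumptions: the function names and numerical values are hypothetical, the continuation ratio logit is taken as log p(X >= x) / p(X = x - 1), and for the global logit the difficulties must be increasing in x for all probabilities to be non-negative.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(theta, gamma, betas, link):
    """Category probabilities lambda_0..lambda_{l-1} of one item."""
    eta = [gamma * (theta - b) for b in betas]   # predictors for x = 1..l-1
    if link == "global":
        # p(X >= x) = expit(eta_x); probabilities are successive differences
        surv = [1.0] + [expit(e) for e in eta] + [0.0]
    elif link == "continuation":
        # p(X >= x | X >= x-1) = expit(eta_x); survival is a running product
        surv = [1.0]
        for e in eta:
            surv.append(surv[-1] * expit(e))
        surv.append(0.0)
    elif link == "local":
        # log(lambda_x / lambda_{x-1}) = eta_x; normalize cumulative sums
        cum = [0.0]
        for e in eta:
            cum.append(cum[-1] + e)
        weights = [math.exp(c) for c in cum]
        total = sum(weights)
        return [w / total for w in weights]
    else:
        raise ValueError("unknown link type: " + link)
    return [surv[x] - surv[x + 1] for x in range(len(eta) + 1)]

# GRM-type item: global logits, free discrimination and free difficulties.
print(category_probs(0.5, 1.2, [-1.0, 0.0, 1.5], "global"))

# RSM-type item: local logits, gamma = 1, beta_jx = beta_j + tau_x.
beta_j, taus = 0.3, [0.0, 0.8, 1.1]
print(category_probs(0.5, 1.0, [beta_j + t for t in taus], "local"))
```

The rating scale structure enters only through the way the list of difficulties is built, so the same function covers all four rows of Table 1 for a given link.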

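As an illustration, consider that if we choose a global logit link function and the
least restrictive parameterization for the item parameters, we obtain the GRM, that
represents one of the best known generalizations of the 2PL model to items with
ordinal responses. This generalization is based on the assumption

$$\log \frac{p(X_j \geq x \mid \theta)}{p(X_j < x \mid \theta)} = \gamma_j(\theta - \beta_{jx}), \qquad j = 1, \ldots, r, \quad x = 1, \ldots, l_j - 1. \qquad (4)$$

Moreover, by combining the local logit link and the most restrictive parameterization for
the item parameters, the RSM results. It represents an extension of the Rasch model to
items with ordinal responses, which is based on the assumption

$$\log \frac{p(X_j = x \mid \theta)}{p(X_j = x - 1 \mid \theta)} = \theta - (\beta_j + \tau_x), \qquad j = 1, \ldots, r, \quad x = 1, \ldots, l - 1. \qquad (5)$$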

Since all the models presented in Table 1 can be expressed in terms of nonlinear 
mixed models (Rijmen et al., 2003), a suitable and very common parameter estimation 
method is the maximum marginal log-likelihood (MML) method, which is based on inte- 
grating out the unknown individual parameters, so that only the item parameters need 
to be estimated. To treat the integral characterizing the marginal log-likelihood function, 
different approaches can be adopted (see Rijmen et al., 2003, for details). Under the assumption
that the latent trait has a normal distribution, the Gauss-Hermite quadrature can
be adopted to compute this integral, which is then maximized by a direct method (e.g., the
Newton-Raphson algorithm) or an indirect one (e.g., the EM algorithm). Alternatively, we can adopt
a quasi-likelihood approach or a Bayesian approach based on Markov Chain Monte Carlo 
methods. Once the model parameters have been estimated, person parameters can be 
estimated by treating item parameters as known and maximizing the log-likelihood with 
respect to the latent trait or, alternatively, using the expected value or the maximum 
value of the corresponding posterior distribution. 

Among the above mentioned models, those based on Rasch type parametrization (i.e., 
PCM and RSM) may be also estimated through the conditional maximum likelihood 
(CML; Wright and Masters, 1982) method. This method allows us to estimate the item 
parameters without formulating any assumption on the latent trait distribution. It is 
based on maximizing the log-likelihood conditioned on the individual raw scores that, in 
the case of Rasch type models, represent sufficient statistics for ability parameters. The
resulting function only depends on the difficulty parameters that, therefore, can be con- 
sistently estimated. Tutz (1990) proposed a modified version of CML method to estimate 
the SRM and the SRSM. Another estimation method used is the joint or unconditional 
maximum likelihood (see Wright and Masters, 1982, for details) which, however, does not
provide consistent parameter estimates. 
3 The proposed class of models



In the following, we describe the multidimensional extension of the unidimensional IRT 
models for ordinal responses mentioned in the previous section, which is based on latent 
traits with a discrete distribution. We first present the assumptions on which the proposed 
class of models is based and, then, a formulation in matrix notation which is useful for 
the estimation. 

We recall that the proposed class of models also represents a generalization to the 
case of ordinal polytomous responses of the class of multidimensional models proposed by 
Bartolucci (2007) for dichotomously-scored items. 

3.1 Basic assumptions 

Let $s$ be the number of different latent traits measured by the items, let $\Theta = (\Theta_1, \ldots, \Theta_s)'$
be a vector of latent variables corresponding to these latent traits, and let $\theta = (\theta_1, \ldots, \theta_s)'$
denote one of its possible realizations. The random vector $\Theta$ is assumed to have a discrete
distribution with $k$ support points, denoted by $\zeta_1, \ldots, \zeta_k$, and probabilities $\pi_1, \ldots, \pi_k$,
with $\pi_c = p(\Theta = \zeta_c)$. Moreover, let $\delta_{jd}$ be a dummy variable equal to 1 if item $j$ measures
latent trait of type $d$ and to 0 otherwise, with $j = 1, \ldots, r$ and $d = 1, \ldots, s$.

Coherently with the introduction of vector $\Theta$, we redefine the conditional response
probabilities

$$\lambda_{jx}(\theta) = p(X_j = x \mid \Theta = \theta), \qquad x = 0, \ldots, l_j - 1,$$

and we let $\lambda_j(\theta) = (\lambda_{j0}(\theta), \ldots, \lambda_{j,l_j-1}(\theta))'$. Then, assumption (1) is generalized as follows

$$g_x[\lambda_j(\theta)] = \gamma_j\Big(\sum_{d=1}^{s} \delta_{jd}\theta_d - \beta_{jx}\Big), \qquad j = 1, \ldots, r, \quad x = 1, \ldots, l_j - 1, \qquad (6)$$

where the item parameters $\gamma_j$ and $\beta_{jx}$ may be subjected to the same parametrizations
illustrated in Section 2. More precisely, on the basis of the constraints assumed on these
parameters, we obtain different specifications of equation (6) which are reported in Table
2, where we distinguish the case of $s = 1$ from that of $s > 1$.



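discrimination   difficulty    Number of latent traits
indices          levels        s = 1                                      s > 1

free             free          $\gamma_j(\theta - \beta_{jx})$            $\gamma_j(\sum_d \delta_{jd}\theta_d - \beta_{jx})$
free             constrained   $\gamma_j[\theta - (\beta_j + \tau_x)]$    $\gamma_j[\sum_d \delta_{jd}\theta_d - (\beta_j + \tau_x)]$
constrained      free          $\theta - \beta_{jx}$                      $\sum_d \delta_{jd}\theta_d - \beta_{jx}$
constrained      constrained   $\theta - (\beta_j + \tau_x)$              $\sum_d \delta_{jd}\theta_d - (\beta_j + \tau_x)$

Table 2: Resulting item parameterizations for s = 1 and s > 1.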
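
The role of the dummy variables delta_jd under between-item multidimensionality can be seen in a few lines of code. The sketch below evaluates the least restrictive predictor of Table 2 for one item at one support point; the function name and all numerical values are hypothetical and chosen only for illustration.

```python
def linear_predictors(theta, delta_j, gamma_j, betas_j):
    """gamma_j * (sum_d delta_jd * theta_d - beta_jx) for x = 1..l_j-1."""
    # Exactly one delta_jd equals 1, so the sum selects the trait measured by item j.
    trait = sum(d * t for d, t in zip(delta_j, theta))
    return [gamma_j * (trait - b) for b in betas_j]

zeta_c = [0.4, -1.0]   # hypothetical support point of one latent class (s = 2)
delta_j = [0, 1]       # item j measures the second latent trait
print(linear_predictors(zeta_c, delta_j, 1.5, [-0.5, 0.2, 1.0]))
```

Setting `gamma_j` to 1 or building `betas_j` as beta_j + tau_x gives the other rows of the table.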



Each of the item parameterizations shown in Table 2 may be combined with any of the
global, local, and continuation ratio logit link functions to obtain different
types of multidimensional LC IRT models for ordinal responses, each generalizing the
corresponding model in Table 1. For instance, we may define the multidimensional
LC versions of GRM, defined through equation (4), and RSM, defined through equation
(5), respectively as

$$\log \frac{p(X_j \geq x \mid \Theta = \theta)}{p(X_j < x \mid \Theta = \theta)} = \gamma_j\Big(\sum_{d=1}^{s} \delta_{jd}\theta_d - \beta_{jx}\Big), \qquad x = 1, \ldots, l_j - 1, \qquad (7)$$

and

$$\log \frac{p(X_j = x \mid \Theta = \theta)}{p(X_j = x - 1 \mid \Theta = \theta)} = \sum_{d=1}^{s} \delta_{jd}\theta_d - (\beta_j + \tau_x), \qquad x = 1, \ldots, l - 1. \qquad (8)$$

Note that when $l_j = 2$, $j = 1, \ldots, r$, so that item responses are binary, equations (7) and
(8) specialize, respectively, into the multidimensional LC 2PL model and the multidimensional
LC Rasch model, both of them described by Bartolucci (2007).

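In all cases, the discreteness of the distribution of the random vector $\Theta$ implies that
the manifest distribution of $\boldsymbol{X} = (X_1, \ldots, X_r)'$ is equal to

$$p(\boldsymbol{x}) = p(\boldsymbol{X} = \boldsymbol{x}) = \sum_{c=1}^{k} p(\boldsymbol{X} = \boldsymbol{x} \mid \Theta = \zeta_c)\,\pi_c, \qquad (9)$$

where, for the subjects in the $c$-th latent class and due to the classical assumption of
local independence, we have

$$p(\boldsymbol{x} \mid c) = p(\boldsymbol{X} = \boldsymbol{x} \mid \Theta = \zeta_c) = \prod_{j=1}^{r} p(X_j = x_j \mid \Theta = \zeta_c) = \prod_{d=1}^{s} \prod_{j \in \mathcal{J}_d} p(X_j = x_j \mid \Theta_d = \zeta_{cd}), \qquad (10)$$

where $\mathcal{J}_d$ denotes the subset of $\mathcal{J} = \{1, \ldots, r\}$ containing the indices of the items
measuring the $d$-th latent trait, with $d = 1, \ldots, s$, and $\zeta_{cd}$ denotes the $d$-th element of $\zeta_c$.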
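
Equations (9) and (10) describe a finite mixture of products of item-level probabilities. A minimal sketch of this computation is given below, assuming that the conditional probabilities of each category are already available as nested lists indexed by class, item, and category; the names `class_conditional_prob` and `manifest_prob` and the numerical values are hypothetical.

```python
def class_conditional_prob(x, cond_c):
    """p(x | c) as in (10): product over items of the probability of x_j in class c."""
    p = 1.0
    for j, xj in enumerate(x):
        p *= cond_c[j][xj]
    return p

def manifest_prob(x, pi, cond):
    """p(x) as in (9): mixture of the class-conditional probabilities."""
    return sum(pi_c * class_conditional_prob(x, cond_c)
               for pi_c, cond_c in zip(pi, cond))

# Illustrative setting: k = 2 classes, r = 2 items, l = 3 categories.
pi = [0.6, 0.4]
cond = [
    [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]],   # class 1: low response categories likely
    [[0.1, 0.3, 0.6], [0.2, 0.3, 0.5]],   # class 2: high response categories likely
]
print(manifest_prob((0, 1), pi, cond))
```

In the proposed class of models the entries of `cond` would not be free, but would be obtained from (6) through the chosen link function and the support points of the latent distribution.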

In order to ensure the identifiability of the proposed models, suitable constraints on
the parameters are required. With reference to the general equation (6), we require that,
for each latent trait, one discriminant index is equal to 1 and one difficulty parameter is
equal to 0. More precisely, let $j_d$ be a specific element of $\mathcal{J}_d$, say the first. Then, when
the discrimination indices are not constrained to be constant as in (2), we assume that

$$\gamma_{j_d} = 1, \qquad d = 1, \ldots, s.$$

Moreover, with free item difficulties we assume that

$$\beta_{j_d 1} = 0, \qquad d = 1, \ldots, s, \qquad (11)$$

whereas with a rating scale parameterization based on (3), we assume

$$\beta_{j_d} = 0, \quad d = 1, \ldots, s, \quad \text{and} \quad \tau_1 = 0. \qquad (12)$$

Coherently with the mentioned identifiability constraints, the number of free parameters
of a multidimensional LC IRT model with ordinal responses is obtained by summing
the number of free probabilities $\pi_c$, the number of ability parameters $\zeta_{cd}$, the number
of free item difficulty parameters $\beta_{jx}$, and that of free item discrimination parameters
$\gamma_j$. We note that the number of free parameters does not depend on the type of logit,
but only on the type of parametrization assumed on item discrimination and difficulty
parameters, as shown in Table 3. In any case, the number of probabilities is equal to
$k - 1$ and the number of ability parameters is equal to $sk$. However, the number of free
item difficulty parameters is given by $[\sum_{j=1}^{r}(l_j - 1) - s]$ under an unconstrained difficulties
parameterization and it is given by $[(r - s) + (l - 2)]$ under a rating scale parameterization.
Finally, the number of free item discrimination parameters is equal to $(r - s)$ under an
unconstrained discrimination parameterization, being 0 otherwise.



discrimination   difficulty    Number of free parameters
indices          levels        (#par)

free             free          $(k - 1) + sk + [\sum_{j=1}^{r}(l_j - 1) - s] + (r - s)$
free             constrained   $(k - 1) + sk + [(r - s) + (l - 2)] + (r - s)$
constrained      free          $(k - 1) + sk + [\sum_{j=1}^{r}(l_j - 1) - s]$
constrained      constrained   $(k - 1) + sk + [(r - s) + (l - 2)]$

Table 3: Number of free parameters for different constraints on item discrimination and
difficulty parameters.
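
The counts in Table 3 are easy to get wrong by hand when several candidate models are compared, so a small helper may be useful. The sketch below simply transcribes the four rows of the table; the function name, its arguments, and the example configuration (14 items with 4 categories each, 2 dimensions, 3 classes) are hypothetical.

```python
def n_free_parameters(k, s, n_categories, free_discrimination, rating_scale):
    """Number of free parameters according to Table 3.

    k: number of latent classes; s: number of latent traits;
    n_categories: list with the number of categories l_j of each item.
    """
    r = len(n_categories)
    n_par = (k - 1) + s * k                  # class weights and support points
    if rating_scale:
        l = n_categories[0]
        if any(lj != l for lj in n_categories):
            raise ValueError("rating scale requires a common number of categories")
        n_par += (r - s) + (l - 2)           # free beta_j and tau_x
    else:
        n_par += sum(lj - 1 for lj in n_categories) - s   # free beta_jx
    if free_discrimination:
        n_par += r - s                       # free gamma_j
    return n_par

items = [4] * 14
for free_disc in (True, False):
    for rs in (False, True):
        print(free_disc, rs, n_free_parameters(3, 2, items, free_disc, rs))
```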



3.2 Formulation in matrix notation 

In order to efficiently implement parameter estimation, in this section we express the
above described class of models using matrix notation. In order to simplify the
description, we consider the case in which every item has the same number of response
categories, that is $l_j = l$, $j = 1, \ldots, r$; the extension to the general case in which items may
also have a different number of response categories is straightforward. In the following,
by $0_a$ we denote a column vector of $a$ zeros, by $O_{ab}$ an $a \times b$ matrix of zeros, by $I_a$ an
identity matrix of size $a$, by $1_a$ a column vector of $a$ ones. Moreover, we use the symbol
$u_{ab}$ to denote a column vector of $a$ zeros with the $b$-th element equal to one and $T_a$ to
denote an $a \times a$ lower triangular matrix of ones. Finally, by $\otimes$ we indicate the Kronecker
product.

As concerns the link function used in (6), it may be expressed in a general way to
include different types of parameterizations (Glonek and McCullagh, 1995; Colombi and
Forcina, 2001) as follows:

$$g[\lambda_j(\theta)] = C \log[M \lambda_j(\theta)], \qquad (13)$$

where the vector $g[\lambda_j(\theta)]$ has elements $g_x[\lambda_j(\theta)]$ for $x = 1, \ldots, l - 1$. Moreover, $C$ is a
matrix of constraints of the type

$$C = \begin{pmatrix} -I_{l-1} & I_{l-1} \end{pmatrix},$$

whereas, for the global logit link, matrix $M$ is equal to

$$M = \begin{pmatrix} T_{l-1} & 0_{l-1} \\ 0_{l-1} & T'_{l-1} \end{pmatrix},$$

for the local logit link it is equal to

$$M = \begin{pmatrix} I_{l-1} & 0_{l-1} \\ 0_{l-1} & I_{l-1} \end{pmatrix},$$

and for the continuation ratio logit link it is given by

$$M = \begin{pmatrix} I_{l-1} & 0_{l-1} \\ 0_{l-1} & T'_{l-1} \end{pmatrix}.$$

How to obtain the probability vector $\lambda_j(\theta)$ on the basis of a vector of logits defined
as in (13) is described in Colombi and Forcina (2001), where a method to compute the
derivative of a suitable vector of canonical parameters for $\lambda_j(\theta)$ with respect to these
logits may be found.
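
The matrix form of the link function can be checked numerically. The sketch below builds a marginalization matrix for each type of logit and applies the contrast "log numerator minus log denominator"; the block layout used for M (denominators in the first l - 1 rows, numerators in the last l - 1 rows) and the form of the continuation ratio logit are assumptions of this illustration, and all function names are hypothetical. Plain Python lists are used in place of a matrix library to keep the example self-contained.

```python
import math

def lower_triangular(a):
    """a x a lower triangular matrix of ones."""
    return [[1.0 if col <= row else 0.0 for col in range(a)] for row in range(a)]

def identity(a):
    return [[1.0 if col == row else 0.0 for col in range(a)] for row in range(a)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def marginal_matrix(l, link):
    """Assumed 2(l-1) x l matrix M: denominators on top, numerators below."""
    if link not in ("global", "local", "continuation"):
        raise ValueError("unknown link type: " + link)
    a = l - 1
    top = lower_triangular(a) if link == "global" else identity(a)
    bottom = identity(a) if link == "local" else transpose(lower_triangular(a))
    # Pad with a zero column on the right (top block) or on the left (bottom block).
    return [row + [0.0] for row in top] + [[0.0] + row for row in bottom]

def logits_via_matrices(probs, link):
    a = len(probs) - 1
    log_margins = [math.log(sum(w * p for w, p in zip(row, probs)))
                   for row in marginal_matrix(len(probs), link)]
    # C = (-I, I): numerator log minus denominator log for each x = 1..l-1
    return [log_margins[a + x] - log_margins[x] for x in range(a)]

probs = [0.2, 0.3, 0.5]
for link in ("global", "local", "continuation"):
    print(link, [round(v, 4) for v in logits_via_matrices(probs, link)])
```

With three categories, the global logits of this probability vector are log 4 and 0, which agrees with a direct computation of log p(X >= x) / p(X < x).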

Once the ability and difficulty parameters are included in the single vector $\phi$ and
taking into account that the distribution of $\Theta$ has $k$ support points, assumption (6) may
be expressed through the general formula

$$g[\lambda_j(\zeta_c)] = \gamma_j Z_{cj}\phi, \qquad c = 1, \ldots, k, \quad j = 1, \ldots, r,$$

where $Z_{cj}$ is a suitable design matrix. The structure of the parameter vector $\phi$ and of
these design matrices depends on the type of constraint assumed on the difficulty
parameters, as we explain below.

When the difficulty parameters are unconstrained, $\phi$ is a column vector of size
$sk + r(l - 1) - s$, which is obtained from

$$(\zeta_{11}, \ldots, \zeta_{1s}, \ldots, \zeta_{ks}, \beta_{11}, \ldots, \beta_{r,l-1})'$$

by removing the parameters constrained to be 0; see (11). Accordingly, for $c = 1, \ldots, k$
and $j = 1, \ldots, r$, the design matrix $Z_{cj}$ is obtained by removing suitable columns from
the matrix

$$\begin{pmatrix} 1_{l-1}(u_{kc} \otimes u_{sd})' & -u'_{rj} \otimes I_{l-1} \end{pmatrix},$$

where $d$ is the dimension measured by item $j$. On the other hand, under a rating scale
parameterization, $\phi$ is a vector of size $sk + (r - s) + (l - 2)$ which is obtained from

$$(\zeta_{11}, \ldots, \zeta_{1s}, \ldots, \zeta_{ks}, \beta_1, \ldots, \beta_r, \tau_1, \ldots, \tau_{l-1})'$$

by removing the parameters constrained to be 0 in (12). Accordingly, the design matrix
$Z_{cj}$ is obtained by removing specific columns from

$$\begin{pmatrix} 1_{l-1}(u_{kc} \otimes u_{sd})' & -1_{l-1}u'_{rj} & -I_{l-1} \end{pmatrix},$$

where, again, $d$ is the dimension measured by item $j$.
4 Likelihood inference



In this section, we deal with likelihood inference for the models proposed in the previous
section. In particular, we first show how to compute the model log-likelihood and how
to maximize it by an EM algorithm. Finally, we deal with model selection. All the
computational procedures are implemented in Matlab and R and are available on request
from the authors.

4.1 Model estimation 

On the basis of an observed sample of dimension $n$, the log-likelihood of a model
formulated as proposed in Section 3 may be expressed as

$$\ell(\eta) = \sum_{\boldsymbol{x}} n_{\boldsymbol{x}} \log[p(\boldsymbol{x})],$$

where $\eta$ is the vector containing all the free model parameters, $n_{\boldsymbol{x}}$ is the frequency of the
response configuration $\boldsymbol{x}$, $p(\boldsymbol{x})$ is computed according to (9) and (10) as a function of $\eta$,
and by $\sum_{\boldsymbol{x}}$ we mean the sum extended to all the possible response configurations $\boldsymbol{x}$.
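
In practice the sum only needs to run over the response configurations that are actually observed, since the others have zero frequency. A minimal sketch of this computation follows, assuming that the class weights and the conditional category probabilities are given as plain lists; the function name and the toy data are hypothetical.

```python
import math
from collections import Counter

def log_likelihood(patterns, pi, cond):
    """Sum over observed configurations x of n_x * log p(x)."""
    counts = Counter(tuple(x) for x in patterns)   # frequencies n_x
    total = 0.0
    for x, n_x in counts.items():
        # p(x) = sum_c pi_c * prod_j lambda_{j, x_j}(zeta_c), see (9) and (10)
        p_x = sum(pi_c * math.prod(cond_c[j][xj] for j, xj in enumerate(x))
                  for pi_c, cond_c in zip(pi, cond))
        total += n_x * math.log(p_x)
    return total

pi = [0.6, 0.4]
cond = [
    [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]],
    [[0.1, 0.3, 0.6], [0.2, 0.3, 0.5]],
]
sample = [(0, 1), (0, 1), (2, 2)]   # three subjects, two items
print(log_likelihood(sample, pi, cond))
```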

In order to maximize $\ell(\eta)$ with respect to $\eta$ we use an EM algorithm (Dempster et al.,
1977) that is implemented in a similar way as described in Bartolucci (2007), to which
we refer for some details. First of all, denoting by $m_{c,\boldsymbol{x}}$ the (unobserved) frequency of the
response configuration $\boldsymbol{x}$ and the latent class $c$, the complete log-likelihood is equal to

$$\ell^*(\eta) = \sum_{c} \sum_{\boldsymbol{x}} m_{c,\boldsymbol{x}} \log[p(\boldsymbol{x} \mid c)\,\pi_c]. \qquad (14)$$

Now we denote by $\eta_1$ the subvector of $\eta$ which contains the free latent class probabilities
and by $\eta_2$ the subvector containing the remaining free parameters. More precisely, we let
$\eta_1 = \pi$, with $\pi = (\pi_2, \ldots, \pi_k)'$, and $\eta_2 = (\gamma', \phi')'$, where $\gamma$ is obtained by removing from
$(\gamma_1, \ldots, \gamma_r)'$ the parameters which are constrained to be equal to 1 to ensure identifiability.
Obviously, $\gamma$ is not present when constraint (2) is adopted. Then, we can decompose the
complete log-likelihood as

$$\ell^*(\eta) = \ell_1^*(\eta_1) + \ell_2^*(\eta_2),$$

with

$$\ell_1^*(\eta_1) = \sum_{c} m_c \log \pi_c, \qquad (15)$$

$$\ell_2^*(\eta_2) = \sum_{c} \sum_{j} m'_{cj} \log \lambda_j(\zeta_c), \qquad (16)$$

where $m_c = \sum_{\boldsymbol{x}} m_{c,\boldsymbol{x}}$ is the number of subjects in latent class $c$ and $m_{cj}$ is the column
vector with elements $\sum_{\boldsymbol{x}} I(x_j = x)\, m_{c,\boldsymbol{x}}$, $x = 0, \ldots, l - 1$, with $I(\cdot)$ denoting the indicator
function.

The EM algorithm alternates the following two steps until convergence:

E-step: compute the conditional expected value of $\ell^*(\eta)$ given the observed data and
the current value of the parameters;

M-step: maximize the above expected value with respect to $\eta$, so that this parameter
vector is updated.



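The E-step consists of computing, for every $c$ and $\boldsymbol{x}$, the expected value of $m_{c,\boldsymbol{x}}$ given
$n_{\boldsymbol{x}}$ as follows

\[ \hat{m}_{c,\boldsymbol{x}} = n_{\boldsymbol{x}} \frac{p(\boldsymbol{x}|c)\pi_c}{p(\boldsymbol{x})}, \]

and then substituting these expected frequencies in (14). On the basis of $\hat{m}_{c,\boldsymbol{x}}$ we can
obtain the expected frequencies $\hat{m}_c$ and $\hat{m}_{cj}$ which, once substituted in (15) and (16),
allow us to obtain the expected values of $\ell_1^*(\eta_1)$ and $\ell_2^*(\eta_2)$, denoted by $\hat{\ell}_1^*(\eta_1)$ and $\hat{\ell}_2^*(\eta_2)$,
respectively.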
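
The E-step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the conditional probabilities $p(\boldsymbol{x}|c)$ are taken as given inputs (in the paper they follow from the chosen IRT parameterization), and all names are hypothetical.

```python
def expected_frequencies(freqs, class_probs, p_x_given_c):
    """E-step: split each observed frequency n_x across the latent classes.

    freqs:       dict x -> n_x (observed frequency of configuration x)
    class_probs: list of current class weights pi_c
    p_x_given_c: dict x -> list of p(x | c), one entry per class (assumed given)
    Returns a dict x -> list of expected frequencies, one entry per class.
    """
    m_hat = {}
    for x, n_x in freqs.items():
        # Joint probabilities pi_c * p(x | c); their sum is the manifest p(x).
        joint = [pi_c * p for pi_c, p in zip(class_probs, p_x_given_c[x])]
        p_x = sum(joint)
        # Posterior class probabilities times the observed frequency.
        m_hat[x] = [n_x * j_c / p_x for j_c in joint]
    return m_hat
```

By construction the expected frequencies for each configuration sum to $n_{\boldsymbol{x}}$, so summing them over $\boldsymbol{x}$ gives expected class sizes that add up to $n$.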

At the M-step, the function obtained as described above is maximized with respect
to $\eta$ as follows. First of all, regarding the parameters in $\eta_1$ we have an explicit solution
given by

\[ \pi_c = \frac{\hat{m}_c}{n}, \quad c = 2, \ldots, k, \]

which corresponds to the maximum of $\hat{\ell}_1^*(\eta_1)$. To update the other parameters, we
maximize $\hat{\ell}_2^*(\eta_2)$ by a Fisher-scoring algorithm that we illustrate in the following.
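
The closed-form update of the class weights is simple enough to show directly. The sketch below is illustrative only: it takes the expected frequencies from the E-step as a dictionary (a hypothetical data layout) and returns the weights of all $k$ classes, whereas in the text the first weight is obtained by difference.

```python
def update_class_weights(m_hat):
    """M-step for the class weights: pi_c = m_c / n.

    m_hat: dict x -> list of expected frequencies, one entry per latent class
    Returns the list of updated weights for all classes.
    """
    k = len(next(iter(m_hat.values())))
    # Expected class sizes m_c: sum the expected frequencies over configurations.
    m_c = [sum(row[c] for row in m_hat.values()) for c in range(k)]
    n = sum(m_c)  # equals the sample size, since each row sums to n_x
    return [m / n for m in m_c]
```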

The Fisher-scoring algorithm alternates a step in which the parameter vector $\gamma$ is
updated with a step in which the parameter vector $\phi$ is updated. The first step consists
of adding to the current value of each free $\gamma_j$ the ratio $s_{2j}/f_{2j}$, where $s_{2j}$ denotes the score
for $\hat{\ell}_2^*(\eta_2)$ with respect to $\gamma_j$ and $f_{2j}$ denotes the corresponding information computed at
the current value of the parameters. These have the following expressions:

\[ s_{2j} = \sum_c (Z_{cj}\phi)' R_{cj}' (\hat{m}_{cj} - \hat{m}_c \lambda_{cj}), \]

\[ f_{2j} = \sum_c \hat{m}_c (Z_{cj}\phi)' R_{cj}' [\mathrm{diag}(\lambda_{cj}) - \lambda_{cj}\lambda_{cj}'] R_{cj} (Z_{cj}\phi), \]

where $R_{cj}$ is the derivative matrix of the canonical parameter vector for $X_j$ with respect
to the vector of logits in (13); see Colombi and Forcina (2001). Then, the parameter
vector $\phi$ is updated by adding the quantity $F_2^{-1} s_2$, where $s_2$ is the score vector for
$\hat{\ell}_2^*(\eta_2)$ with respect to $\phi$ and $F_2$ denotes the corresponding information computed at the
current parameter value, which have the following expressions:

\[ s_2 = \sum_c \sum_j \gamma_j Z_{cj}' R_{cj}' (\hat{m}_{cj} - \hat{m}_c \lambda_{cj}), \]

\[ F_2 = \sum_c \sum_j \hat{m}_c \gamma_j^2 Z_{cj}' R_{cj}' [\mathrm{diag}(\lambda_{cj}) - \lambda_{cj}\lambda_{cj}'] R_{cj} Z_{cj}. \]



As usual, we suggest initializing the EM algorithm both by a deterministic rule and by
a multi-start strategy based on suitably generated random starting values. In
this way we can deal with the multimodality of the model likelihood.
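
The multi-start idea can be illustrated with a deliberately simplified example. The sketch below is not the estimation routine of this paper: it fits a standard latent class model with binary items (no IRT parameterization, hence no Fisher-scoring step), runs EM from several random starting points, and keeps the solution with the highest log-likelihood. The data in `toy_data` are made up, and all function names are hypothetical.

```python
import math
import random

def class_joint(x, weights, probs):
    """Return [pi_c * p(x | c)] for binary items under local independence."""
    return [w * math.prod(p if xj else 1.0 - p for xj, p in zip(x, pc))
            for w, pc in zip(weights, probs)]

def loglik(data, weights, probs):
    """Log-likelihood sum_x n_x log p(x); data is a list of (x, n_x) pairs."""
    return sum(n_x * math.log(sum(class_joint(x, weights, probs)))
               for x, n_x in data)

def em_fit(data, k, rng, max_iter=1000, tol=1e-8):
    """One EM run from a random start; returns (loglik, weights, probs)."""
    r = len(data[0][0])
    n = sum(n_x for _, n_x in data)
    weights = [1.0 / k] * k
    probs = [[rng.uniform(0.2, 0.8) for _ in range(r)] for _ in range(k)]
    current = loglik(data, weights, probs)
    for _ in range(max_iter):
        # E-step: expected class sizes and expected counts of "1" responses.
        m_c = [0.0] * k
        m_cj = [[0.0] * r for _ in range(k)]
        for x, n_x in data:
            joint = class_joint(x, weights, probs)
            p_x = sum(joint)
            for c in range(k):
                m = n_x * joint[c] / p_x
                m_c[c] += m
                for j in range(r):
                    m_cj[c][j] += m * x[j]
        # M-step: closed-form updates, clipped away from 0 and 1.
        weights = [m / n for m in m_c]
        probs = [[min(max(m_cj[c][j] / max(m_c[c], 1e-12), 1e-6), 1.0 - 1e-6)
                  for j in range(r)] for c in range(k)]
        updated = loglik(data, weights, probs)
        converged = updated - current < tol
        current = updated
        if converged:
            break
    return current, weights, probs

def multi_start(data, k, n_starts=10, seed=0):
    """Run EM from n_starts random starts and keep the best log-likelihood."""
    rng = random.Random(seed)
    fits = [em_fit(data, k, rng) for _ in range(n_starts)]
    return max(fits, key=lambda fit: fit[0])

# Made-up frequencies for three binary items (illustration only).
toy_data = [((0, 0, 0), 30), ((1, 0, 0), 8), ((0, 1, 0), 7), ((0, 0, 1), 9),
            ((1, 1, 0), 6), ((1, 0, 1), 7), ((0, 1, 1), 8), ((1, 1, 1), 25)]
best_loglik, best_weights, best_probs = multi_start(toy_data, k=2)
```

In practice the deterministic start mentioned above would be added to the list of candidate fits, so that the retained solution is never worse than the deterministic one.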
4.2 Model selection



The formulation of a specific model in the class of multidimensional LC IRT models
for ordinal responses univocally depends on: (i) the number of latent classes ($k$); (ii)
the adopted parameterization in terms of link function $g_x(\cdot)$ and constraints on the item
parameters $\gamma_j$ and $\beta_{jx}$; and (iii) the number ($s$) of latent dimensions and the corresponding
allocation of items within each dimension ($\delta_{jd}$, $j = 1, \ldots, r$, $d = 1, \ldots, s$). Thus, model
selection requires a choice for each of these aspects, made by using suitable criteria.
In the following, we mainly refer to the likelihood ratio (LR) test and to the Bayesian
information criterion (BIC; Schwarz, 1978). Firstly, we briefly recall these methods; then,
we illustrate in detail the suggested model selection procedure.

4.2.1 Criteria for model selection 

As is well known, given a certain hypothesis denoted by $H_0$, the LR test is based on
the statistic

\[ D = -2(\hat{\ell}_0 - \hat{\ell}), \]

where $\hat{\ell}_0$ and $\hat{\ell}$ denote the maximum of the log-likelihood under the reduced model, which
incorporates $H_0$, and under the general model, respectively. Under this hypothesis, and
provided that suitable regularity conditions hold, the LR statistic is asymptotically
distributed as a $\chi^2_q$, where $q$ is the difference in the number of parameters between the two
nested models being compared. An asymptotically equivalent alternative to the LR test is the
Wald test, which, however, requires computing the information matrix of the model.
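
A minimal Python sketch of the LR test follows. It uses only the standard library, so the chi-squared tail probability is computed from the known closed-form expressions for integer degrees of freedom rather than from a statistics package; the function names are hypothetical. The usage line takes the log-likelihoods and parameter counts of the bidimensional and unidimensional models reported in Table 7.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability P(chi2_df > x) for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_i (x/2)^(i-1/2) / Gamma(i+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total

def lr_test(loglik_reduced, loglik_general, npar_reduced, npar_general):
    """Return the deviance D, the degrees of freedom q, and the p-value."""
    deviance = -2.0 * (loglik_reduced - loglik_general)
    q = npar_general - npar_reduced
    return deviance, q, chi2_sf(deviance, q)

# Unidimensional (reduced) versus bidimensional (general) model, Table 7;
# the table reports a deviance of 1.290 and a p-value of 0.256.
deviance, q, p_value = lr_test(-2731.894, -2731.249, 59, 60)
```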

Unlike the LR (and Wald) statistics, information criteria provide
neither a test of a model in the usual sense of testing a null hypothesis nor information
about the way a model fits the data in absolute terms. However, they offer a relative
measure of the information lost when a given model is used to describe observed data. Besides,
they are particularly useful for selecting among two or more general models, especially
non-nested models, which cannot be compared by means of LR or Wald tests.

Different types of information criteria have been proposed in the statistical literature,
and among them we prefer the Bayesian Information Criterion (BIC; Schwarz, 1978),
which adds to the maximized log-likelihood a penalty term that takes into account the
number of parameters. More precisely, this criterion is based on the index:

\[ BIC = -2\hat{\ell} + \log(n)\,\#\mathrm{par}, \]

where $\hat{\ell}$ is the maximum value of the log-likelihood of the model of interest, and #par
is the number of free parameters defined in Table 3. The smaller the BIC index, the
better the model fit. Therefore, among a set of competing models, we choose the one
with the minimum BIC value.
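
The index is straightforward to compute. The short sketch below (function name hypothetical) applies the formula to the two-class standard LC model of Table 5, for which the log-likelihood is -2814.635 with 85 parameters and $n = 201$.

```python
import math

def bic(loglik, n_par, n):
    """Bayesian information criterion: -2 * loglik + log(n) * n_par."""
    return -2.0 * loglik + math.log(n) * n_par

# Two-class standard LC model from Table 5 (reported BIC: 6080.051).
bic_k2 = bic(-2814.635, 85, 201)
```

With $n = 201$ each additional parameter costs $\log(201) \approx 5.30$ units, so a parameter is worth keeping only if it raises the log-likelihood by more than about 2.65.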

BIC is preferred here to other information criteria because it satisfies some desirable
properties. Mainly, under certain regularity conditions it is asymptotically consistent
(?). Moreover, since it applies a larger penalty for additional parameters (for reasonable
sample sizes) in comparison with other criteria, BIC tends to select more parsimonious
models.
4.2.2 Model selection procedure



As stressed at the beginning of this section, the specification of a multidimensional LC 
IRT model for ordinal items implies a number of choices. A model selection procedure is 
here proposed which is based on the following sequence of ordered steps: 

1. selection of the optimal number k of latent classes; 

2. selection of the type of link function; 

3. selection of the number of latent dimensions and item allocation within each dimen- 
sion; 

4. selection of constraints on the item discriminating and difficulty parameters.

These steps are described in more detail in the following.

1. Selection of the number of latent classes. To detect the optimal number k of latent
classes, it is useful to proceed by comparing models that differ only in the number
of latent classes, all other features being equal. More precisely, we suggest adopting
the standard LC model (Goodman, 1974), characterized by one dimension for each
item. In this way, no choice on the link function and the item parameterization is
required; also, any restrictive assumption on item dimensionality is avoided.

To compare LC models we rely on BIC, as it is not feasible to compare LC models
with different numbers of latent classes through an LR test statistic. In particular,
we fit the LC model with increasing values of k; then, the value just before the first
increase in the BIC index is taken as the optimal number of latent classes.

A crucial problem with LC models is the multimodality of the likelihood function.
To prevent the choice of k from being based on a local, rather than global, maximum
point, we suggest repeating the estimation process by randomly varying the starting
values of the model parameters. Then, for each possible value of k, we select the
highest log-likelihood value obtained and, consequently, the smallest estimated BIC
value.

2. Selection of the logit link function. As described in Section 2, it is possible to choose 
among three different types of logit: local logits, global logits, and continuation ratio 
logits. In particular, we perform the comparison between models on the basis of the 
mentioned logit functions and adopting BIC, which is here preferred to the LR test 
statistic as the latter cannot be validly used when models are not nested. Besides, 
when comparing the models, we choose the number of latent classes as selected in the 
previous step and we adopt the same multidimensional latent structure, that is with 
one dimension for each item. As concerns the item parameterization, we suggest
choosing the most general one, which is based on both free item discriminating
parameters and free item difficulty parameters.

Since it can happen that no relevant difference in the goodness of fit of
the competing models emerges (i.e., the BIC index takes very similar values), the
choice of the type of logit should also take into account the different interpretations
behind the three types of logits; see also Maydeu-Olivares et al. (1994) and Samejima
(1996).
3. Selection of dimensions. Detection of latent traits is of main interest when estimating
multidimensional IRT models. Several authors have dealt with testing unidimensionality
in connection with Rasch type models. One of the main contributions
is due to Martin-Löf (1973), who developed an LR test for the unidimensionality
assumption against the alternative that the items consist of two subsets, defined in
advance, each measuring one latent trait. This test has been generalized through
a conditional non-parametric approach by Christensen et al. (2002) to the case of
polytomous items and to cases with more than two dimensions.

With the aim of detecting latent traits in a more general context than that of Rasch
type models, the LR statistic may be used to test the unidimensionality of a set
of items against a specific multidimensional alternative, the null hypothesis being
specified as

\[ H_0 : \theta_{d_1 c} = a_{d_1 d_2} + b_{d_1 d_2}\, \theta_{d_2 c}, \quad \forall\, d_1 \neq d_2 = 1, \ldots, s, \]

for two constants $a_{d_1 d_2}$ and $b_{d_1 d_2}$, where the second is equal to 1 if the parameterization
based on constant discrimination indices is assumed.

For instance, in the case of two dimensions, we compare a model in which these 
dimensions are collapsed (unidimensionality assumption) with a model in which 
they are kept distinct (bidimensionality assumption), all other elements being equal, 
in accordance with the results of the previous steps. On the basis of this principle, 
Bartolucci (2007) proposes a model-based hierarchical clustering procedure that can 
also be applied for the extended models here proposed to take into account ordinal 
items and that allows us to detect groups of items that measure the same latent 
trait. 

4. Choice of the item discriminating and difficulty parameterization. This step consists
of the choice of the possible constraints on the discriminating and difficulty
parameters. Four different types of model may be defined by combining free or
constrained $\gamma_j$ parameters with free or constrained $\beta_{jx}$ parameters. Once the other
elements of the model have been defined through the previous steps, we may perform
the comparison among the four models on the basis of the LR (or Wald) test.
Indeed, the null hypothesis $H_0$ we are testing when we compare a model with free
$\gamma_j$ with a model with constrained $\gamma_j$ is the same as that expressed in (2). Similarly,
by decomposing the item difficulty parameters as the sum of two components, that is
$\beta_{jx} = \beta_j + \tau_{jx}$, where $\tau_{jx}$ refers to item $j$ and category $x$, and maintaining the
same assumption about the discriminating parameters, we easily realize that hypothesis
(3) is equivalent to

\[ H_0 : \tau_{jx} = \tau_x, \quad j = 1, \ldots, r, \quad x = 1, \ldots, l_j - 1, \]

which can still be tested by an LR statistic.
5 Application to measurement of anxiety and depression



The data used to illustrate the proposed class of polytomous LC IRT models concern a
sample of 201 Italian oncological patients who were asked to fill in questionnaires about
their health and perceived quality of life. Here we are interested in anxiety and depression,
as assessed by the "Hospital Anxiety and Depression Scale" (HADS) developed by Zigmond
and Snaith (1983). The questionnaire is composed of 14 polytomous items equally
divided between the two dimensions:

1. anxiety (7 items: 2, 6, 7, 8, 10, 11, 12); 

2. depression (7 items: 1, 3, 4, 5, 9, 13, 14). 

Within this context of study, the assumption of unidimensionality might
not be realistic. Thus, the adoption of the proposed class of models, rather than a
unidimensional IRT model, appears more suitable and more convenient, as it allows
us to detect homogeneous classes of individuals who have similar latent characteristics, so
that patients in the same class will receive the same clinical treatment.

All items of the HADS questionnaire have four response categories: the minimum
value 0 corresponds to a low level of anxiety or depression, whereas the maximum value
3 corresponds to a high level of anxiety or depression. Table 4 shows the distribution
of item responses among the four categories, distinguishing between the two supposed
dimensions.





                     Response category
item              0       1       2       3     Total
2              35.3    52.7     8.0     4.0     100.0
6              39.8    46.3    10.0     4.0     100.0
7              46.3    22.4    21.9     9.5     100.0
8              19.4    49.3    24.9     6.5     100.0
10              7.0    40.8    44.3     8.0     100.0
11             30.8    49.8    11.4     8.0     100.0
12             34.3    46.3    14.9     4.5     100.0
Anxiety        30.4    43.9    19.3     6.3     100.0
1              43.8    32.8    16.4     7.0     100.0
3              56.7    29.9     9.0     4.5     100.0
4              31.8    54.7    11.9     1.5     100.0
5              46.3    38.8    13.4     1.5     100.0
9               9.0    27.9    55.2     8.0     100.0
13             42.3    42.3    11.4     4.0     100.0
14             30.8    37.3    28.9     3.0     100.0
Depression     37.2    37.7    20.9     4.2     100.0

Table 4: Distribution of HADS item responses (row percentage frequencies).



Altogether, responses are mainly concentrated in categories 0 and 1 both for anxiety
and depression, whereas category 3, which denotes high levels of psychopathological
disturbances, is selected less than 10% of the time for each item. By summing item responses,
it is possible to obtain, for each patient, a score indicating a raw measure of anxiety and
depression: the closer the raw score is to the minimum value 0, the lower the level of
anxiety or depression is, and vice versa. The mean raw score observed for the entire sample is
very similar across the two dimensions, being 7.11 for anxiety and 7.17 for depression
(the standard deviations are 4.15 and 4.16, respectively). The correlation between scores
on anxiety and scores on depression is very high, being equal to 0.98.

To proceed to the model selection, the four ordered steps suggested in Section 4.2.2 are 
followed. We recall that the first step consists of detecting the optimal number k of latent 
classes. To this aim, the standard LC model is employed and a comparison among models 
which differ by the number of latent classes is performed for k = 1,2,3,4. The results 
of this preliminary fitting are reported in Table 5, where, to address the multimodality
problem, results are given for both deterministic and random starting values.





          Deterministic start                    Random start
k      log-lik.    #par        BIC      log-lik. (max)    #par    BIC (min)
1     -3153.151      42   6529.040          -3153.151      42     6529.040
2     -2814.635      85   6080.051          -2814.635      85     6080.051
3     -2677.822     128   6034.468          -2674.484     128     6027.791
4     -2645.435     171   6197.736          -2608.570     171     6104.805

Table 5: Standard LC models: log-likelihood ($\hat{\ell}$), number of parameters, and BIC values
for k = 1,...,4 latent classes; in boldface is the smallest BIC value, selected with
deterministic and random starts.

On the basis of the adopted selection criterion, we choose k = 3 as the optimal number of
latent classes since, for this number of latent classes, the smallest estimated
BIC value is observed, both with a deterministic and with a random initialization of the EM
algorithm.
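
The selection rule of step 1 (take the value of k just before the first increase in BIC) can be checked against the deterministic-start results of Table 5 with a few lines of Python. The function names are hypothetical; the log-likelihoods, parameter counts, and sample size are those reported above.

```python
import math

def bic(loglik, n_par, n):
    """Bayesian information criterion: -2 * loglik + log(n) * n_par."""
    return -2.0 * loglik + math.log(n) * n_par

def select_k(fits, n):
    """Return the k just before the first increase in BIC.

    fits: list of (k, loglik, n_par) tuples ordered by increasing k.
    If BIC never increases, the largest k considered is returned.
    """
    values = [(k, bic(loglik, n_par, n)) for k, loglik, n_par in fits]
    for (k, current), (_, following) in zip(values, values[1:]):
        if following > current:
            return k
    return values[-1][0]

# Deterministic-start results from Table 5 (n = 201 patients).
table5 = [(1, -3153.151, 42), (2, -2814.635, 85),
          (3, -2677.822, 128), (4, -2645.435, 171)]
print(select_k(table5, n=201))  # prints 3
```

Stopping at the first increase avoids fitting models with many classes, at the cost of possibly missing a later, lower BIC value; here the random-start column of Table 5 leads to the same choice.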

As regards the second step and the choice of the best logit link function, a comparison
between a graded response type model and a partial credit type model is carried out by
assuming k = 3 latent classes, free item discriminating and difficulty parameters, a
completely general multidimensional structure for the data (i.e., r dimensions, one for
each item), and basing the comparison on the BIC index. Note that the continuation
ratio logit link function is not suitable in this context, because the item response process
does not consist of a sequence of successive steps. Table 6 shows that a global logit link
is to be preferred to a local logit link. Also, it can be observed that a graded response
type model has a better fit than the standard LC model, as the BIC value observed for
the former is smaller than that detected for the latter (see Table 5).





             Global logit    Local logit
log-lik.        -2726.348      -2741.321
#par                   72             72
BIC              5834.534       5864.479

Table 6: Graded response and partial credit type models with k = 3: log-likelihood ($\hat{\ell}$),
number of parameters, and BIC values; in boldface is the smallest BIC value.
Once we have chosen the global logit as the best link function, we carry on with
the test of unidimensionality. An LR test is used to compare models which differ in
their dimensional structure, all other elements being equal (i.e., free item
discriminating and difficulty parameters), that is (i) a graded response model with
r-dimensional structure, (ii) a graded response model with bidimensional structure (i.e.,
anxiety and depression), and (iii) a graded response model with unidimensional structure
(i.e., all the items belong to the same dimension). For the sake of completeness,
log-likelihood and BIC values are also provided for each model considered. On account of
both BIC and the LR test, the hypothesis of unidimensionality may be accepted (see
Table 7). This result is coherent with a similar analysis performed on the same data by
Bacci and Bartolucci (to appear), where item responses were dichotomized and a Rasch
parameterization was adopted.



Model              log-lik.    #par        BIC    Deviance    p-value
r-dimensional     -2726.348      72   5834.534
bidimensional     -2731.249      60   5780.696       9.802      0.633
unidimensional    -2731.894      59   5776.682       1.290      0.256

Table 7: r-dimensional, bidimensional, and unidimensional graded response models with
k = 3: log-likelihood, number of parameters, BIC value, and LR test results (deviance and
p-value); in boldface the smallest BIC value.

As previously outlined, the choice of the number of parameters per item depends
on both the presence of a constant/non-constant discriminating index ($\gamma_j$) and of a
constant/non-constant threshold difficulty parameter ($\beta_{jx}$) for each item. In our
application, this implies a comparison among four models, in accordance with the classification
adopted in Table 1. The parameterization is chosen on account of the unidimensional
data structure and the previously selected global logit link function. Besides, because the
compared models are nested, the parameterization is selected on the basis of an LR test.
Again, for the sake of completeness, log-likelihood and BIC values are also provided for
each model considered.

The analyses show (Table 8) that GRM is to be preferred to RS-GRM, while between
GRM and IP-GRM the latter is to be preferred. Besides, as IP-GRM has a better fit than
IP-RS-GRM, IP-GRM is the preferred model among the four considered, that is, the graded
response type model with free $\beta_{jx}$ parameters and constant $\gamma_j$ parameters. This result is
reached by taking into account both the BIC criterion and the LR test.
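
As a worked check, the deviances and degrees of freedom behind these comparisons can be recomputed from the log-likelihoods and parameter counts reported in Table 8 alone; the values agree with the table up to rounding of the log-likelihoods.

```latex
\begin{align*}
D_{\text{RS-GRM vs GRM}} &= -2\,[-2795.570 - (-2731.894)] = 127.352, & q &= 59 - 33 = 26,\\
D_{\text{IP-GRM vs GRM}} &= -2\,[-2741.285 - (-2731.894)] = 18.782, & q &= 59 - 46 = 13,\\
D_{\text{IP-RS-GRM vs IP-GRM}} &= -2\,[-2844.518 - (-2741.285)] = 206.466, & q &= 46 - 20 = 26.
\end{align*}
```

Only the second deviance is small relative to its degrees of freedom (p-value 0.130 in Table 8), which is why the constraint of constant discriminating parameters is retained while the rating scale constraint on the difficulties is rejected.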

As the sequence of the previously described steps may be considered partly arguable, it
can also be shown that the same results, in terms of link function, item parameterization,
and dimensionality, would have been obtained if all such models were compared
at once using log-likelihood and BIC values as selection criteria. Indeed,
Table 9 shows that the smallest BIC value is observed when selecting: (i) a global logit
link function; (ii) constrained $\gamma_j$ parameters and free $\beta_{jx}$ parameters, that is, an IP-GRM
model; and (iii) a unidimensional structure for the data.

Model          log-lik.    #par        BIC    Deviance                 p-value
GRM           -2731.894      59   5776.682
RS-GRM        -2795.570      33   5766.149    127.353 (vs GRM)           0.000
IP-GRM        -2741.285      46   5726.521     18.782 (vs GRM)           0.130
IP-RS-GRM     -2844.518      20   5795.102    206.467 (vs IP-GRM)        0.000

Table 8: Item parameters selection: log-likelihood, number of parameters, BIC values, and
LR test results (deviance and p-value) between nested graded response models with k = 3
and s = 1; in boldface the smallest BIC value.

                  Item parameters                      Global logit             Local logit
Dimensionality    $\gamma_j$      $\beta_{jx}$       log-lik.        BIC      log-lik.        BIC
r-dimensional     free/constr.    free              -2726.347   5834.534     -2741.321   5864.479
                  free/constr.    constrained       -2815.568   5875.088     -2836.766   5917.484
bidimensional     free            free              -2731.249   5780.696     -2749.839   5817.877
                  constrained     free              -2740.658   5735.875     -2764.787   5784.132
                  free            constrained       -2798.959   5778.230     -2835.611   5851.534
                  constrained     constrained       -2843.227   5803.127     -2869.223   5855.120
unidimensional    free            free              -2731.894   5776.682     -2750.214   5813.323
                  constrained     free              -2741.285   5726.521     -2765.129   5774.211
                  free            constrained       -2795.570   5766.149     -2833.179   5841.366
                  constrained     constrained       -2844.518   5795.102     -2870.178   5846.422

Table 9: Log-likelihood and BIC values for the global and local logit link functions, taking
into account the dimensional structure (r-dimensional/bidimensional/unidimensional)
and the item parameters (depending on whether they are free/constrained); in boldface is
the smallest BIC value.

The estimates of the support points and of the probabilities $\hat{\pi}_c$, c = 1, 2, 3, under the selected
unidimensional IP-GRM model are shown in Table 10. On the basis of these results, we
conclude that patients who suffer from psychopathological disturbances are mostly represented
in the first two classes, whereas only 16.7% of the subjects belong to the third class.
Furthermore, patients belonging to class 1 present the least severe conditions, whereas
patients in class 3 present the worst conditions.





                                         Latent class c
Dimension                               1         2         3
Psychopathological disturbances    -0.776     1.183     3.418
Probability                         0.342     0.491     0.167

Table 10: Estimated support points and probabilities $\hat{\pi}_c$ of latent classes for the
unidimensional IP-GRM.
6 Concluding remarks



In this article, we extend the class of multidimensional latent class (LC) Item Response
Theory (IRT) models (Bartolucci, 2007) for dichotomously-scored items to the case of
ordinal polytomously-scored items. The proposed models are formulated in a general
way, so that several different parameterizations may be adopted for the distribution of
the response variable, conditional on the vector of latent traits. The classification criteria
we use are based on three main elements: the type of link function, which may be based
on global, local, or continuation ratio logits; the type of constraints on item discriminating
parameters, which may be completely free or all kept equal to one; and the type of
constraints on item difficulty parameters, which may be formulated so that each item has
different distances between consecutive response categories or in a more parsimonious way,
where the distance between difficulty levels from category to category within each item
is the same across all items. According to the way these criteria are combined, twelve
possible parameterizations result, some of which are well known in the psychometric
literature, such as those referring to the Graded Response Model (Samejima, 1969), the
Partial Credit Model (Masters, 1982), and the Rating Scale Model (Andrich, 1978).

The proposed class of models is more flexible in comparison with traditional formulations
of IRT models, often based on restrictive assumptions, such as unidimensionality
and normality of the latent trait. In particular, the assumption of multidimensionality allows
us to take more than one latent trait into account at the same time and to study the
correlation between latent traits. Moreover, in the proposed class of models, no specific
assumption about the distribution of latent traits is necessary, since a latent class
approach is adopted, in which the latent traits are represented by a random vector with
a discrete distribution common to all subjects. In this way, subjects with similar latent
traits are assigned to the same latent class, so as to detect homogeneous subpopulations
of subjects. Moreover, the latent class approach represents a notable simplification from the
computational point of view with respect to the case of continuous latent traits, where
the marginal likelihood is characterized by a multidimensional integral that is difficult to treat.

In order to make inference on the proposed model, we show how the log-likelihood
may be efficiently maximized by the EM algorithm. We also propose a model selection
procedure to choose the different features that contribute to define a specific
multidimensional LC IRT model. In general, comparisons between different parameterizations are
based on information criteria, in particular the Bayesian Information Criterion
(Schwarz, 1978), or on the likelihood ratio (or Wald) test, the latter being useful in the
presence of nested models. First of all, we suggest verifying the reasonableness of the
discreteness assumption by selecting the number of latent classes. In order to obtain a more
parsimonious model, this first phase of the model selection should be performed with reference to
the standard LC model. Then, given the selected number of latent classes and the most
general parameterization about items and dimensionality, the choice among global, local
or continuation ratio logit link functions may be performed, so that a graded response or
a partial credit or a sequential model is selected. This phase should also take into account
the interpretability of the type of logit with reference to the specific application problem.
The next phase consists of choosing the number of latent dimensions and the allocation
of items within each dimension. This phase may be more or less complex depending
on a priori information about the dimensionality structure of the questionnaire. Finally,
possible constraints on the item discriminating and difficulty parameters are selected, by
comparing nested models that are equal as concerns all the other elements.

The class of multidimensional LC IRT models for ordinal items and the proposed
model selection procedure are illustrated through an application to a dataset, which
concerns the measurement of psychopathological disturbances (i.e., anxiety and depression) in
oncological patients by using the Hospital Anxiety and Depression Scale of Zigmond and Snaith
(1983). The results show that subjects can be classified into three latent classes, and the
item responses can be explained by a graded response type model with items having
the same discriminating power and different distances between consecutive response
categories. The bidimensionality assumption is rejected in favor of unidimensionality, so that
all items of the questionnaire measure the same latent psychopathological disturbance.